Suffix Trays and Suffix Trists: Structures for Faster Text Indexing
Authors
Abstract
Similar resources
Compressed Suffix Arrays and Suffix Trees with Applications to Text Indexing and String Matching∗
The proliferation of online text, such as found on the World Wide Web and in online databases, motivates the need for space-efficient text indexing methods that support fast string searching. We model this scenario as follows: Consider a text T consisting of n symbols drawn from a fixed alphabet Σ. The text T can be represented in n lg |Σ| bits by encoding each symbol with lg |Σ| bits. The goal...
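The n lg |Σ| bit representation mentioned in this abstract can be illustrated with a small sketch. It assumes each symbol gets a fixed-width code of ⌈lg |Σ|⌉ bits (rounding up for alphabets whose size is not a power of two); the function names `bits_per_symbol`, `pack`, and `unpack` are hypothetical and do not come from the paper.

```python
def bits_per_symbol(sigma):
    # ceil(lg sigma) for sigma >= 2; at least 1 bit for a unary alphabet (assumption).
    return max(1, (sigma - 1).bit_length())

def packed_size_bits(n, sigma):
    # Total bits for n symbols at a fixed width per symbol.
    return n * bits_per_symbol(sigma)

def pack(text, alphabet):
    # Concatenate fixed-width codes into one Python integer, first symbol in the high bits.
    width = bits_per_symbol(len(alphabet))
    code = {ch: i for i, ch in enumerate(alphabet)}
    packed = 0
    for ch in text:
        packed = (packed << width) | code[ch]
    return packed

def unpack(packed, n, alphabet):
    # Inverse of pack: read n fixed-width codes back out, high bits first.
    width = bits_per_symbol(len(alphabet))
    mask = (1 << width) - 1
    chars = []
    for i in range(n):
        shift = (n - 1 - i) * width
        chars.append(alphabet[(packed >> shift) & mask])
    return "".join(chars)
```

For a DNA-like alphabet of four symbols, each symbol takes 2 bits, so a text of 8 symbols fits in 16 bits. This is only the uncompressed baseline that the abstract starts from, not the compressed index it goes on to describe.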
Faster Compressed Suffix Trees for Repetitive Text Collections
Recent compressed suffix trees targeted to highly repetitive text collections reach excellent compression performance, but with operation times on the order of milliseconds. We design a new suffix tree representation for this scenario that still achieves very low space usage, only slightly larger than the best previous one, but supports the operations within microseconds. This puts the data structur...
Faster suffix sorting
We propose a fast and memory efficient algorithm for lexicographically sorting the suffixes of a string, a problem that has important applications in data compression as well as string matching. Our algorithm eliminates much of the overhead of previous specialized approaches while maintaining their robustness for all kinds of input. For input size n, our algorithm operates in only two integer a...
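To make the problem statement concrete, here is a deliberately naive sketch of suffix sorting. It is not the algorithm proposed in this abstract: it simply sorts suffix start positions by comparing the suffixes themselves, which can take quadratic time or worse on repetitive input. The name `naive_suffix_array` is hypothetical.

```python
def naive_suffix_array(s):
    # Return the start positions of all suffixes of s in lexicographic order.
    # Each comparison may scan O(n) characters, so this is a baseline only.
    return sorted(range(len(s)), key=lambda i: s[i:])

if __name__ == "__main__":
    for start in naive_suffix_array("banana"):
        print(start, "banana"[start:])
```

Specialized suffix sorting algorithms produce the same ordering while avoiding the repeated character-by-character comparisons that this baseline performs.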
Dotted Suffix Trees: A Structure for Approximate Text Indexing
In this work, we address text indexing for approximate matching. Given a text T which undergoes some preprocessing to generate an index, we can later query this index to identify the places where a string occurs up to a certain number of errors k (edit distance). The indexing structure occupies space O(n log n) in the average case, independent of alphabet size. This structure can be used ...
Faster Sparse Suffix Sorting
The sparse suffix sorting problem is to sort b = o(n) arbitrary suffixes of a string of length n using o(n) words of space in addition to the string. We present an O(n) time Monte Carlo algorithm using O(b log b) space and an O(n log b) time Las Vegas algorithm using O(b) space. This is a significant improvement over the best prior solutions by Bille et al. (ICALP 2013): a Monte Carlo algorithm...
Journal
Journal title: Algorithmica
Year: 2014
ISSN: 0178-4617,1432-0541
DOI: 10.1007/s00453-013-9860-6